Improving semi-supervised deep neural network for keyword search in low resource languages
Abstract
In this work, we investigate how to improve semi-supervised DNNs for low-resource languages, where the initial systems may have a high error rate. We propose using semi-supervised MLP features for DNN training, and we explore using confidence to improve semi-supervised cross-entropy and sequence training. This work was evaluated under the IARPA Babel program on keyword spotting tasks. We focus on the limited condition, in which around 10 hours of supervised data are available for training.
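The abstract does not say how confidence enters the cross-entropy objective. One plausible reading, shown here only as a minimal sketch and not as the authors' method, is to weight each frame's loss by the confidence of its (possibly automatic) label and to drop frames below a cutoff. The function name, the `threshold` parameter, and the convention that manually transcribed frames carry a confidence of 1.0 are all assumptions made for illustration.

```python
import math


def confidence_weighted_cross_entropy(posteriors, labels, confidences, threshold=0.0):
    """Confidence-weighted average cross-entropy over frames (illustrative only).

    posteriors:  list of per-frame class probability lists
    labels:      list of target class indices, one per frame
    confidences: list of weights in [0, 1]; assumed 1.0 for supervised frames
    threshold:   hypothetical cutoff; frames with lower confidence are skipped
    """
    total_loss = 0.0
    total_weight = 0.0
    for probs, label, conf in zip(posteriors, labels, confidences):
        if conf < threshold:
            continue  # discard frames whose automatic label is too unreliable
        # Clamp the probability to avoid log(0).
        total_loss += -conf * math.log(max(probs[label], 1e-12))
        total_weight += conf
    # Normalize by the total weight so the scale matches the unweighted loss.
    return total_loss / total_weight if total_weight > 0 else 0.0


# Two frames: one manually labelled (confidence 1.0), one automatically labelled.
posteriors = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
loss = confidence_weighted_cross_entropy(posteriors, labels, [1.0, 0.5])
```

With all confidences set to 1.0 this reduces to the ordinary mean cross-entropy, so the weighting only changes how much the automatically transcribed frames contribute.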
Similar resources
The 2016 RWTH Keyword Search System for Low-Resource Languages
In this paper we describe the RWTH Aachen keyword search (KWS) system developed in the course of the IARPA Babel program. We put focus on acoustic modeling with neural networks and evaluate the full pipeline with respect to the KWS performance. At the core of this study lie multilingual bottleneck features extracted from a deep neural network trained on all 28 languages available to the project...
Recent improvements in neural network acoustic modeling for LVCSR in low resource languages
In this paper we focus on several techniques that improve deep neural network (DNN) acoustic modeling for low-resource languages. We explore the use of different features, such as fundamental-frequency variation (FFV) and tonal features, and the normalization of these features for deep neural network training. Specifically, we study the impact of these features in conjunction with a tonal lexicon and s...
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach that uses random forest as the base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised approach uses unlabeled data together with a small amount of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
Unsupervised adaptation for deep neural network using linear least square method
In this paper, we propose a novel model-based adaptation for deep neural networks based on a linear least square method. Our proposed algorithm can perform unsupervised adaptation even if the automatic transcripts have a word error rate of 60-70%. We evaluate our algorithm on low-resource languages from the IARPA BABEL program, such as Assamese, Bengali, Haitian Creole, Lao and Zulu. Our expe...
Semi-supervised training for bottle-neck feature based DNN-HMM hybrid systems
In this paper, we investigate a semi-supervised training (SST) method in various state-of-the-art acoustic modeling techniques, using bottle-neck and corresponding tandem features. These techniques include subspace GMM, tanh-neuron deep neural network (DNN), and a generalized soft-maxout (p-norm) DNN. We demonstrate that SST may lead to a Word Error Rate (WER) reduction of up to 2% using all these techni...